Abstract. The well-known Eulerian path problem can be solved in polynomial 
time (more exactly, there exists a linear-time algorithm for this problem). In this 
paper, we model the problem using a string matching framework, and then initiate 
an algorithmic study on a variant of this problem, called the (2,1)-STRING-MATCH 
problem (which is actually a generalization of the Eulerian path problem). Then, 
we present a polynomial-time algorithm for the (2,1)-STRING-MATCH problem, 
which is the most important result of this paper. Specifically, we get a lower bound 
of $\Omega(n)$ and an upper bound of $O(n^2)$. 
1. Introduction 

The (2,1)-STRING-MATCH problem, as it will be formulated below, 
has been frequently encountered in many areas of computer science, 
especially as the well-known Eulerian path problem (which is actually 
a particular case of the (2, 1)-STRING-MATCH problem). In this paper, 
we initiate an algorithmic study on this variant of the Eulerian path 
problem. As we shall see throughout the paper, it can be solved in 
polynomial time using some basic graph theory concepts. Let us first 
fix some basic terminology. 

Basic notions and notation. The set of natural numbers is denoted 
by $\mathbb{N}$. A multiset is a 2-uple $(X, f)$, where $X$ is a set and 
$f : X \to \mathbb{N}$ is a function. A finite and nonempty set is called an 
alphabet. If $\Sigma$ is an alphabet, then $\Sigma^n$ denotes the set of all 
strings of length $n$ over $\Sigma$. For a string $x = \sigma_1 \ldots \sigma_n \in \Sigma^n$, 
let $First(x, i)$ denote the substring $\sigma_1 \ldots \sigma_i$, and let $Last(x, i)$ 
denote the substring $\sigma_{n-i+1} \ldots \sigma_n$. Let $U = (u_1, \ldots, u_k)$ be a 
$k$-uple. We denote by $U.i$ the $i$-th component of $U$, that is, $U.i = u_i$ 
for all $i \in \{1, \ldots, k\}$. The 0-uple is denoted by $()$. If $q$ is an element 
or an uple, and $i \in \{1, \ldots, k\}$, then we define the uples 
$U \triangleleft q$ and $U \triangleright i$ by: 

- $U \triangleleft q = (u_1, \ldots, u_k, q)$; 

- $U \triangleright i = (u_1, \ldots, u_{i-1}, u_{i+1}, \ldots, u_k)$. 
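The notation above maps directly onto tuple and string slicing. The following is a minimal sketch, assuming uples are represented as Python tuples and strings as `str`; the function names `first`, `last`, `append` and `remove` are illustrative and do not appear in the paper.

```python
def first(x: str, i: int) -> str:
    """First(x, i): the first i symbols of x."""
    return x[:i]


def last(x: str, i: int) -> str:
    """Last(x, i): the last i symbols of x."""
    return x[len(x) - i:]


def append(u: tuple, q) -> tuple:
    """U followed by q as a new last component (the left-triangle operator)."""
    return u + (q,)


def remove(u: tuple, i: int) -> tuple:
    """U without its i-th component (the right-triangle operator).

    The index i is 1-based, as in the text.
    """
    return u[:i - 1] + u[i:]
```

Indices are kept 1-based in `remove` so that calls read the same way as the formulas.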
© 2008 Kluwer Academic Publishers. Printed in the Netherlands. 
If $s, t \in \mathbb{N}$ are such that $1 \le t < s$, then the $(s, t)$-STRING-MATCH 
problem is stated as follows. 

Given: An alphabet $\Sigma$, and an $n$-uple $U$, $n \ge 2$, such that $U.i \in \Sigma^s$ for 
all $i \in \{1, \ldots, n\}$. 

Output: Does there exist a permutation $p = (j_1, \ldots, j_n)$ of the set 
$\{1, \ldots, n\}$ such that $Last(U.j_i, t) = First(U.j_{i+1}, t)$ for all 
$i \in \{1, \ldots, n-1\}$? 

EXAMPLE 1. Let $\Sigma = \{a, b, c\}$, and let $U = (ab, ac, cb, cc, ba)$ be a 5-uple. 
Consider the permutation $p = (1, 5, 2, 4, 3)$ of the set $\{1, 2, 3, 4, 5\}$. 
One can verify that $Last(U.1, 1) = First(U.5, 1)$, $Last(U.5, 1) = First(U.2, 1)$, 
$Last(U.2, 1) = First(U.4, 1)$, and $Last(U.4, 1) = First(U.3, 1)$, 
that is, $U$ is a "YES" instance of the (2,1)-STRING-MATCH problem. 
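The check in Example 1 can be reproduced mechanically. The sketch below assumes the uple is a Python tuple of strings and that the permutation is given with 1-based indices as in the text; the helper name `is_match` is hypothetical.

```python
def is_match(U, p, t=1):
    """Return True if the 1-based permutation p chains the strings of U,
    i.e. Last(U.j_i, t) = First(U.j_{i+1}, t) for consecutive entries."""
    if sorted(p) != list(range(1, len(U) + 1)):
        return False  # p is not a permutation of {1, ..., n}
    ordered = [U[j - 1] for j in p]
    return all(a[len(a) - t:] == b[:t] for a, b in zip(ordered, ordered[1:]))


U = ("ab", "ac", "cb", "cc", "ba")
print(is_match(U, (1, 5, 2, 4, 3)))  # True
print(is_match(U, (1, 2, 3, 4, 5)))  # False: "ab" ends in b, "ac" starts with a
```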

Throughout the paper, we study only the (2,1)-STRING-MATCH problem, 
since it can be easily modelled using graph theory. However, we 
consider that some of the results presented may apply to the generalized 
case as well. 



2. A Polynomial-time Algorithm for the 
(2,1)-STRING-MATCH Problem 

This section is organized as follows. First, we recall some basic definitions 
and reformulate the problem stated above using graph theory. 
Then, we present two naive (and superpolynomial-time) algorithms for 
solving the (2,1)-STRING-MATCH problem. Finally, we prove the main 
result of this paper, which gives a polynomial-time algorithm for our 
problem. 

DEFINITION 1. A pseudodigraph is a 3-uple $G = (V, E, f)$, where $V$ 
is the set of vertices, $E \subseteq V \times V$ is the set of edges, and $(E, f)$ is 
a multiset. If a pseudodigraph is specified simply by $G$, we denote by 
$V(G)$ (or $G.1$) its set of vertices, and by $E(G)$ (or $G.2$) its set of edges. 

If $(M, g)$ is a multiset such that $M \subseteq E$ and $g(e) \le f(e)$ for all $e \in M$, 
then the subpseudodigraph of $G$ induced by the multiset $(M, g)$ 
is the 3-uple $H = (V_1, M, g)$, where 
$V_1 = \{v \mid \exists\, e \in M \text{ such that } v \in \{e.1, e.2\}\}$. 

A path in a pseudodigraph is an uple $P = (e_1, \ldots, e_k)$ of edges 
such that $e_i.2 = e_{i+1}.1$ for all $i \in \{1, \ldots, k-1\}$. A chain is an uple 
$C = (e_1, \ldots, e_k)$ of edges such that $\{e_i.1, e_i.2\} \cap \{e_{i+1}.1, e_{i+1}.2\} \neq \emptyset$ for 
all $i \in \{1, \ldots, k-1\}$. A pseudodigraph $G$ is called connected if any two 
of its vertices are linked by a chain in $G$. 
If $G = (V, E, f)$ is a pseudodigraph, then for every vertex $v \in V$ we define 
$F^+(v) = \sum_{(v,v') \in E} f(v, v')$ (the out-degree of $v$) 
and $F^-(v) = \sum_{(v',v) \in E} f(v', v)$ (the in-degree of $v$). 

DEFINITION 2. Let $\Sigma$ be an alphabet, and $U$ an $n$-uple such that $U.i \in \Sigma^2$ 
for all $i \in \{1, \ldots, n\}$. The pseudodigraph associated to $U$, denoted 
by $G(U)$, is a pseudodigraph $G(U) = (V, E, f)$ such that: 

- $V = \{v \mid \exists\, i \in \{1, \ldots, n\} \text{ such that } v \in \{First(U.i, 1), Last(U.i, 1)\}\}$, 

- $E = \{(x, y) \mid \exists\, i \in \{1, \ldots, n\} \text{ such that } U.i = xy\}$, 

- $f(x, y) = |\{j \mid U.j = xy\}|$ for all $(x, y) \in E$. 

It is easy to verify that $\sum_{e \in E} f(e) = n$. 

EXAMPLE 2. Let $\Sigma = \{a, b, c, d, e, f\}$ be an alphabet, and consider 
the 9-uple 

$U = (ca, eb, ad, bf, dc, fe, ab, ab, ba)$ 

whose components are strings of length two over $\Sigma$. Then, $G(U)$ can be 
illustrated as in the following figure. 




Figure 1. $G(U)$: the pseudodigraph associated to $U$. 

In terms of graphs, the (2,1)-STRING-MATCH problem can be formulated 
as follows. 

Given: An alphabet $\Sigma$, an $n$-uple $U$ such that $U.i \in \Sigma^2$ for all 
$i \in \{1, \ldots, n\}$, and $G(U) = (V, E, f)$ the pseudodigraph associated to 
$U$. 

Output: Does there exist a path $P = (e_1, \ldots, e_n)$ in $G(U)$ such that 
we have $f(e) = |\{i \mid e_i = e\}|$ for all $e \in E$? 

ALGORITHM 1. (A naive algorithm) Let $\Sigma$ be an alphabet, and $U$ 
an $n$-uple, $n \ge 2$, such that $U.i \in \Sigma^2$ for all $i \in \{1, \ldots, n\}$. The 
simplest algorithm for the (2,1)-STRING-MATCH problem is to determine 
whether there exists a permutation $p = (j_1, \ldots, j_n)$ of the set $\{1, \ldots, n\}$ 
such that $Last(U.j_i, 1) = First(U.j_{i+1}, 1)$ for all $i \in \{1, \ldots, n-1\}$. In the 
worst case, this algorithm runs in $O(n!)$, which is very slow. 
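The exhaustive search of Algorithm 1 can be sketched with the standard library. This is an illustration only, assuming the uple is a tuple of nonempty strings; the name `string_match_naive` is not from the paper.

```python
from itertools import permutations


def string_match_naive(U, t=1):
    """Try every ordering of U; return "YES" if some ordering chains."""
    for order in permutations(U):
        # An ordering is valid if each string's last t symbols equal
        # the next string's first t symbols.
        if all(a[len(a) - t:] == b[:t] for a, b in zip(order, order[1:])):
            return "YES"
    return "NO"


print(string_match_naive(("ab", "ac", "cb", "cc", "ba")))  # YES
print(string_match_naive(("ab", "cd")))                    # NO
```

Because all $n!$ orderings may have to be generated, this is only practical for very small inputs.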
ALGORITHM 2. Let us now describe a recursive algorithm, which can 
be obtained by improving the previous one. The idea is that we try to 
find a solution (that is, a permutation) progressively, by adding a new 
component to the current uple until we get an uple of length $n$. Note 
that the first call of the function must be StringMatch1((), U, 0, n). 

StringMatch1(uple S, uple R, integer t, integer n) 
1.  begin 
2.    res := "NO"; 
3.    if t = 0 then 
4.      for i = 1 to n do 
5.      begin 
6.        res := StringMatch1((R.i), R ▷ i, 1, n); 
7.        if res = "YES" then goto 22.; 
8.      end 
9.    else 
10.     for i = 1 to n - t do 
11.       if Last(S.t, 1) = First(R.i, 1) then 
12.         if t = n - 1 then 
13.         begin 
14.           res := "YES"; 
15.           goto 22.; 
16.         end 
17.         else 
18.         begin 
19.           res := StringMatch1(S ◁ R.i, R ▷ i, t + 1, n); 
20.           if res = "YES" then goto 22.; 
21.         end 
22.   return res; 
23. end 

It is trivial to verify that in the worst case, the time complexity of 
this algorithm is superpolynomial. Specifically, we get an upper bound of 
$O(n!)$. We need a better algorithm. As we shall see, the (2,1)-STRING-MATCH 
problem can be solved in polynomial time. 
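The recursive search of Algorithm 2 can be sketched as follows. This is a simplified rendering, not a line-by-line translation: the counters t and n are dropped because Python tuples carry their own length, and the function name `string_match1` is illustrative. It assumes every string is nonempty.

```python
def string_match1(S=(), R=()):
    """S: the chain built so far; R: the strings not yet used."""
    if not R:
        return "YES"  # every string has been placed
    for i, candidate in enumerate(R):
        # The first string is unconstrained; later ones must start with
        # the last symbol of the current chain.
        if not S or S[-1][-1] == candidate[0]:
            rest = R[:i] + R[i + 1:]
            if string_match1(S + (candidate,), rest) == "YES":
                return "YES"
    return "NO"


print(string_match1((), ("ab", "ac", "cb", "cc", "ba")))  # YES
```

The search prunes orderings as soon as two neighbours fail to match, but the worst case remains superpolynomial.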

Let us now prove the main result of this paper, which gives a 
polynomial-time algorithm for the (2,1)-STRING-MATCH problem. 

THEOREM 1. Let $\Sigma$ be an alphabet, and let $U$ be an $n$-uple such that 
$U.i \in \Sigma^2$ for all $i \in \{1, \ldots, n\}$. Then $U$ is a "YES" instance of the 
(2,1)-STRING-MATCH problem if and only if $G(U) = (V, E, f)$ is 
connected and exactly one of the following two conditions holds: 

1. $F^+(v) = F^-(v)$ for all $v \in V$; 

2. There exist 2 vertices $v_1, v_2 \in V$, $v_1 \neq v_2$, such that $F^+(v_1) = 
F^-(v_1) + 1$, $F^-(v_2) = F^+(v_2) + 1$, and $F^+(v) = F^-(v)$ for all 
$v \in V - \{v_1, v_2\}$. 

Proof. 

Necessity. 

First, recall that we have $\sum_{e \in E} f(e) = n$, and assume that $U$ is a 
"YES" instance of the (2,1)-STRING-MATCH problem, that is, there 
exists a path $P = (e_1, \ldots, e_n)$ in $G(U)$ such that $f(e) = |\{j \mid e_j = e\}|$ 
for all $e \in E$. 

Given that $P$ is a path of length $n$, $\sum_{e \in E} f(e) = n$, and $f(e) = 
|\{j \mid e_j = e\}|$ for all $e \in E$, we get that $(E, f) = \{e_1, \ldots, e_n\}$. This 
implies that the set $V$ cannot be divided into 2 disjoint sets $X, Y$ such 
that $\{(x, y) \mid x \in X \text{ and } y \in Y\} \cap E = \emptyset$. Thus, we get that $G(U)$ is 
connected. 

Since $(E, f) = \{e_1, \ldots, e_n\}$ and $e_i.2 = e_{i+1}.1$ for all $i \in \{1, \ldots, n-1\}$, 
it follows that $F^+(v) = F^-(v)$ for all $v \in V - \{e_1.1, e_n.2\}$. If 
$e_1.1 = e_n.2$ then we have $F^+(e_1.1) = F^-(e_1.1)$, and so we are in the 
first case. Otherwise, if $e_1.1 \neq e_n.2$, we get that $F^+(e_1.1) = F^-(e_1.1) + 1$ 
and $F^-(e_n.2) = F^+(e_n.2) + 1$, that is, the second case. 

Sufficiency. 

1. Let us assume that $G(U)$ is connected and we are in the first case, 
that is, $F^+(v) = F^-(v)$ for all $v \in V$. Let $P_1 = (e_1, \ldots, e_p)$ be a path of 
maximal length in $G(U)$ such that $p \le n$ and $f(e) \ge |\{j \mid e_j = e\}|$ for 
all $e \in E$. If $p = n$ then $f(e) = |\{j \mid e_j = e\}|$ for all $e \in E$, and thus, 
we conclude that $U$ is a "YES" instance of the (2,1)-STRING-MATCH 
problem. 

Otherwise, if $p < n$, it follows that there exists an edge $(x, y) \in E$ 
such that $f(x, y) > |\{j \mid e_j = (x, y)\}|$. Since $F^+(v) = F^-(v)$ for all 
$v \in V$, $f(e) \ge |\{j \mid e_j = e\}|$ for all $e \in E$, and $P_1$ is a path of maximal 
length, we find that $e_1.1 = e_p.2$ (otherwise, if $e_1.1 \neq e_p.2$, it follows 
that there exists an edge $(e_p.2, z) \in E$ such that $f(e_p.2, z) > |\{j \mid e_j = 
(e_p.2, z)\}|$, that is, $P_1$ is not a path of maximal length with the two 
properties specified above). 

Since $G(U)$ is connected, we conclude that there exists an edge $(x, y) \in 
E$ and $i \in \{1, \ldots, p\}$ such that $f(x, y) > |\{j \mid e_j = (x, y)\}|$ and 
$x \in \{e_i.1, e_i.2\}$. Let $x = e_i.1$, and denote by $G_1 = (V_1, E_1, f_1)$ the 
subpseudodigraph of $G(U)$ induced by the edges $e \in E$ with $f(e) > |\{j \mid 
e_j = e\}|$. Exactly, we have: 

- $V_1 = \{v \mid \exists\, e \in E \text{ such that } v \in \{e.1, e.2\} \text{ and } f(e) > |\{j \mid e_j = e\}|\}$, 

- $E_1 = \{e \mid e \in E \text{ and } f(e) > |\{j \mid e_j = e\}|\}$, 

- $f_1(e) = f(e) - |\{j \mid e_j = e\}|$ for all $e \in E_1$. 

Using the fact that $F^+(v) = F^-(v)$ for all $v \in V$, $G(U)$ is connected, 
$P_1$ is a path, and $e_1.1 = e_p.2$, we find that there exists a 
path $P_2 = (z_1, \ldots, z_q)$ in $G_1$ such that $z_1 = (x, y)$, $z_q.2 = x$, and 
$f_1(e) \ge |\{j \mid z_j = e\}|$ for all $e \in E_1$. 

If we consider the path $P = (e_1, \ldots, e_{i-1}, z_1, \ldots, z_q, e_i, \ldots, e_p)$, we 
find that $f(e) \ge |\{j \mid P.j = e\}|$ for all $e \in E$, and $p + q \le n$, that is, $P_1$ 
is not a path of maximal length in $G(U)$. Thus, assuming that $p < n$, 
we get a contradiction. 

2. Let us assume that $G(U)$ is connected and we are in the second case, 
that is, there exist 2 vertices $v_1, v_2 \in V$, $v_1 \neq v_2$, such that $F^+(v_1) = 
F^-(v_1) + 1$, $F^-(v_2) = F^+(v_2) + 1$, and $F^+(v) = F^-(v)$ for all 
$v \in V - \{v_1, v_2\}$. 

Let $Z = U \triangleleft v_2 v_1$ be an $(n+1)$-uple. Then, one can easily verify that 
$G(Z)$ is connected and $F^+(v) = F^-(v)$ for all $v \in G(Z).1$. According 
to the first case, $Z$ is a "YES" instance of the (2,1)-STRING-MATCH 
problem. 

If $G(Z) = (V_Z, E_Z, f_Z)$, then let $P = (e_1, \ldots, e_{n+1})$ be a path in 
$G(Z)$ such that $f_Z(e) = |\{j \mid P.j = e\}|$ for all $e \in E_Z$. Also, it is 
easy to verify that $e_1.1 = e_{n+1}.2$. Let $i \in \{1, \ldots, n+1\}$ be such that 
$e_i = (v_2, v_1)$. Then, the $n$-uple $T = (e_{i+1}, \ldots, e_{n+1}, e_1, \ldots, e_{i-1})$ is a 
path in $G(U)$ and $f(e) = |\{j \mid T.j = e\}|$ for all $e \in E$. Thus, $U$ is a 
"YES" instance of the (2,1)-STRING-MATCH problem. □ 

ALGORITHM 3. Let $G = (V, E, f)$ be a pseudodigraph without isolated 
vertices, and let $n = \sum_{e \in E} f(e)$. Note that $|V| \le 2|E| \le 2n$. 
Then, the following algorithm can be used to determine 
whether $G$ is connected. Specifically, it gets as input the set $E$ of edges, 
and returns "YES" if and only if $G$ is connected, that is, $V$ cannot 
be divided into two disjoint sets $X, Y$ such that $\{(x, y) \mid x \in X \text{ and } 
y \in Y\} \cap E = \emptyset$. 

Connected(set E) 
begin 
  t := 0; 
  sets := (); 
  for each e ∈ E do 
  begin 
    idx1 := 0; 
    idx2 := 0; 
    for i = 1 to t do 
      for each x ∈ sets.i do 
      begin 
        if x = e.1 then idx1 := i; 
        if x = e.2 then idx2 := i; 
      end 
    if idx1 = 0 then 
      if idx2 = 0 then 
      begin 
        sets := sets ◁ {e.1, e.2}; 
        t := t + 1; 
      end 
      else sets.idx2 := sets.idx2 ∪ {e.1}; 
    else 
      if idx2 = 0 then sets.idx1 := sets.idx1 ∪ {e.2}; 
      else 
        if idx1 ≠ idx2 then 
          if idx1 < idx2 then 
          begin 
            M := sets.idx1 ∪ sets.idx2; 
            sets := sets ▷ idx2; 
            sets := sets ▷ idx1; 
            sets := sets ◁ M; 
            t := t - 1; 
          end 
          else 
          begin 
            M := sets.idx1 ∪ sets.idx2; 
            sets := sets ▷ idx1; 
            sets := sets ▷ idx2; 
            sets := sets ◁ M; 
            t := t - 1; 
          end 
  end 
  if t = 1 then return "YES"; 
  else return "NO"; 
end 

One can easily remark that we get an upper bound of $O(n^2)$, and a 
lower bound of $\Omega(|E|)$. 
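The same connectivity test can be sketched with a union-find (disjoint-set) structure. This is a substituted technique, not the set-merging procedure of Algorithm 3, and the name `connected` and the dictionary-based representation are assumptions made for illustration. As in the pseudocode, edge direction is ignored and an empty edge set yields "NO".

```python
def connected(E):
    """E: iterable of (x, y) vertex pairs. Return "YES" if all vertices
    touched by E lie in one component of the underlying undirected graph."""
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for x, y in E:
        parent[find(x)] = find(y)  # merge the two components

    roots = {find(v) for v in list(parent)}
    return "YES" if len(roots) == 1 else "NO"


print(connected({("a", "b"), ("c", "b")}))  # YES
print(connected({("a", "b"), ("c", "d")}))  # NO
```

With path compression, merging tends to be much cheaper in practice than rescanning every set for each edge.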

ALGORITHM 4. Let $U$ be an $n$-uple whose components are strings of 
length two over a given alphabet. Then, the following algorithm returns 
the pseudodigraph associated to $U$ in linear time. Exactly, the time 
complexity of this algorithm is $\Theta(n)$. 
GetPseudodigraph(uple U, integer n) 
begin 
  V := ∅; 
  E := ∅; 
  for i = 1 to n do 
  begin 
    V := V ∪ {First(U.i, 1), Last(U.i, 1)}; 
    E := E ∪ {(First(U.i, 1), Last(U.i, 1))}; 
  end 
  Let f : E → N be such that f(e) = 0 for all e ∈ E; 
  for i = 1 to n do 
    f(First(U.i, 1), Last(U.i, 1)) := f(First(U.i, 1), Last(U.i, 1)) + 1; 
  return (V, E, f); 
end 
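The construction of the pseudodigraph fits in a few lines when the multiplicity function is held in a `collections.Counter`. The sketch below assumes each component is a two-character Python string; the name `get_pseudodigraph` and the `(V, E, f)` return shape mirror the pseudocode but are otherwise illustrative.

```python
from collections import Counter


def get_pseudodigraph(U):
    """Return (V, E, f) for an uple U of two-letter strings."""
    # f maps each edge (first letter, last letter) to its multiplicity.
    f = Counter((s[0], s[1]) for s in U)
    E = set(f)
    V = {v for edge in E for v in edge}
    return V, E, f


V, E, f = get_pseudodigraph(("ca", "eb", "ad", "bf", "dc", "fe", "ab", "ab", "ba"))
print(f[("a", "b")])     # 2
print(sum(f.values()))   # 9
```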

ALGORITHM 5. Let $U$ be an $n$-uple as in the previous algorithm, and 
consider that $G(U) = (V, E, f)$. Note that $|V| \le 2n$. Then, the following 
algorithm returns "YES" if and only if exactly one of the following two 
conditions holds: 

1. $F^+(v) = F^-(v)$ for all $v \in V$; 

2. There exist 2 vertices $v_1, v_2 \in V$, $v_1 \neq v_2$, such that $F^+(v_1) = 
F^-(v_1) + 1$, $F^-(v_2) = F^+(v_2) + 1$, and $F^+(v) = F^-(v)$ for all 
$v \in V - \{v_1, v_2\}$. 

TestConditions(pseudodigraph (V, E, f)) 
begin 
  for each v ∈ V do 
  begin 
    F+(v) := 0; 
    F-(v) := 0; 
  end 
  for each e ∈ E do 
  begin 
    F+(e.1) := F+(e.1) + f(e); 
    F-(e.2) := F-(e.2) + f(e); 
  end 
  M := ∅; 
  for each v ∈ V do 
    if F+(v) ≠ F-(v) then M := M ∪ {v}; 
  if |M| = 0 then return "YES"; 
  else 
    if |M| ≠ 2 then return "NO"; 
    else 
    begin 
      Let M = {v1, v2}; 
      d1 := F+(v1) - F-(v1); 
      d2 := F+(v2) - F-(v2); 
      if d1 = 1 and d2 = -1 then return "YES"; 
      else 
        if d1 = -1 and d2 = 1 then return "YES"; 
        else return "NO"; 
    end 
end 



Note that the time complexity of the algorithm above is $\Theta(|E|)$, and 
$|E| \le n$, that is, the algorithm is linear. 
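The degree test can be sketched as below, assuming the pseudodigraph is given as a vertex set `V`, an edge set `E` of pairs, and a mapping `f` from edges to multiplicities. The name `check_degree_conditions` is a hypothetical stand-in for TestConditions.

```python
from collections import defaultdict


def check_degree_conditions(V, E, f):
    """Return "YES" if all vertices are balanced, or if exactly two are
    unbalanced with F+ - F- equal to +1 and -1 respectively."""
    out_deg = defaultdict(int)  # F+(v)
    in_deg = defaultdict(int)   # F-(v)
    for (x, y) in E:
        out_deg[x] += f[(x, y)]
        in_deg[y] += f[(x, y)]
    M = [v for v in V if out_deg[v] != in_deg[v]]
    if not M:
        return "YES"
    if len(M) != 2:
        return "NO"
    d1, d2 = (out_deg[v] - in_deg[v] for v in M)
    return "YES" if sorted((d1, d2)) == [-1, 1] else "NO"


E = {("a", "b"), ("b", "c")}
print(check_degree_conditions({"a", "b", "c"}, E, {e: 1 for e in E}))  # YES
```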

ALGORITHM 6. We are now ready to describe the algorithm suggested 
by Theorem 1. Let $U$ be an $n$-uple whose components are strings 
of length two over a given alphabet. Then, the following polynomial 
algorithm returns "YES" if and only if $U$ is a "YES" instance of the 
(2,1)-STRING-MATCH problem. 

StringMatch2(uple U, integer n) 
begin 
  G(U) := GetPseudodigraph(U, n); 
  if Connected(G(U).2) = "YES" then 
    if TestConditions(G(U)) = "YES" then return "YES"; 
    else return "NO"; 
  else return "NO"; 
end 

It is easy to verify that we get an upper bound of $O(n^2)$, and a lower 
bound of $\Omega(n)$. Thus, we conclude that the (2,1)-STRING-MATCH 
problem can be solved in polynomial time. 
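Putting the three steps together gives a compact decision procedure. The sketch below is self-contained and makes some assumptions of its own: connectivity is checked with a plain graph search over an undirected adjacency map rather than with the set-merging routine of Algorithm 3, the input is a nonempty tuple of two-letter strings, and the name `string_match2` is illustrative.

```python
from collections import Counter


def string_match2(U):
    """Decide the (2,1)-STRING-MATCH problem via the criterion of Theorem 1."""
    f = Counter((s[0], s[1]) for s in U)  # edge -> multiplicity

    # Step 1: connectivity, ignoring edge direction.
    adj = {}
    for x, y in f:
        adj.setdefault(x, set()).add(y)
        adj.setdefault(y, set()).add(x)
    start = next(iter(adj))
    seen, todo = {start}, [start]
    while todo:
        for w in adj[todo.pop()]:
            if w not in seen:
                seen.add(w)
                todo.append(w)
    if len(seen) != len(adj):
        return "NO"

    # Step 2: degree conditions, with diff[v] = F+(v) - F-(v).
    diff = Counter()
    for (x, y), m in f.items():
        diff[x] += m
        diff[y] -= m
    unbalanced = sorted(d for d in diff.values() if d != 0)
    return "YES" if unbalanced in ([], [-1, 1]) else "NO"


print(string_match2(("ca", "eb", "ad", "bf", "dc", "fe", "ab", "ab", "ba")))  # YES
print(string_match2(("ab", "cd")))                                            # NO
```

With the graph search used here, the running time is linear in the number of strings for a fixed alphabet.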

EXAMPLE 3. Consider the 9-uple $U = (ca, eb, ad, bf, dc, fe, ab, ab, ba)$. 
One can verify that $G(U)$ is connected and TestConditions(G(U)) returns 
"YES". Thus, we conclude that the 9-uple $U$ is a "YES" instance of 
the (2,1)-STRING-MATCH problem, that is, there exists a path $P = 
(e_1, \ldots, e_9)$ in $G(U)$ such that 

$|\{j \mid e_j = e\}| = |\{j \mid e = (First(U.j, 1), Last(U.j, 1))\}|$ 

for all $e \in G(U).2$. 




Figure 2. $G(U)$: the pseudodigraph associated to $U$. 

For example, the path $P = (ab, bf, fe, eb, ba, ad, dc, ca, ab)$ illustrated 
above leads to the same conclusion. 
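A path like this one can be constructed rather than guessed. The sketch below uses Hierholzer's algorithm, a standard technique for Eulerian paths that is not described in this paper, although it relies on the same cycle-splicing idea as the sufficiency proof of Theorem 1. The function name `eulerian_order` and its convention of returning `None` for "NO" instances are assumptions made for illustration.

```python
from collections import defaultdict


def eulerian_order(U):
    """Return an ordering of the two-letter strings in U that chains,
    or None if no such ordering exists."""
    out_edges = defaultdict(list)
    balance = defaultdict(int)  # F+(v) - F-(v)
    for s in U:
        out_edges[s[0]].append(s)
        balance[s[0]] += 1
        balance[s[1]] -= 1
    starts = [v for v, d in balance.items() if d == 1]
    ends = [v for v, d in balance.items() if d == -1]
    if any(abs(d) > 1 for d in balance.values()) or len(starts) > 1 or len(ends) > 1:
        return None
    start = starts[0] if starts else U[0][0]

    # Walk until stuck, then backtrack; edges are emitted in reverse order.
    stack, path = [(start, None)], []
    while stack:
        v, edge = stack[-1]
        if out_edges[v]:
            nxt = out_edges[v].pop()
            stack.append((nxt[1], nxt))
        else:
            stack.pop()
            if edge is not None:
                path.append(edge)
    path.reverse()
    # If some string was never reached, the graph is not connected.
    return path if len(path) == len(U) else None


order = eulerian_order(["ca", "eb", "ad", "bf", "dc", "fe", "ab", "ab", "ba"])
print(order)
```

The ordering found may differ from the path $P$ given in the example, since a "YES" instance can admit several valid orderings.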